

Search for: All records

Creators/Authors contains: "Yao, Huaxiu"



  1. Free, publicly-accessible full text available May 1, 2024
  2. A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms commonly used fine-tuning approaches. Moreover, the type of distribution shift influences which subset is more effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best. We validate our findings systematically across seven real-world data tasks spanning three types of distribution shifts. Theoretically, we prove that for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers. Intuitively, fine-tuning more parameters on a small target dataset can cause information learned during pre-training to be forgotten, and the relevant information depends on the type of shift. An illustrative code sketch of layer-selective fine-tuning appears after this list.
  3. Web personalization, e.g., recommendation or relevance search, which tailors a service or product to specific online users, is becoming increasingly important. Inductive personalization aims to infer the relations between existing entities and unseen new ones, e.g., searching relevant authors for new papers or recommending new items to users. This problem is challenging because most recent studies focus on the transductive setting, which covers only existing entities. In addition, although some inductive learning approaches have been introduced recently, their performance is sub-optimal due to relatively simple and inflexible architectures for aggregating entity content. To this end, we propose the inductive contextual personalization (ICP) framework based on contextual relation learning. Specifically, we first formulate the pairwise relations between entities with a ranking optimization scheme that employs a neural aggregator to fuse each entity's heterogeneous content. Next, we introduce a node embedding term to capture an entity's contextual relations, acting as a smoothness constraint on the ranking objective. Finally, a gradient descent procedure with adaptive negative sampling is used to learn the model parameters. The learned model can infer the relations between existing entities and unseen inductive ones. Thorough experiments demonstrate that ICP outperforms numerous baseline methods on two applications: relevant-author search and new-item recommendation. A sketch of the pairwise ranking objective appears after this list.
  4. Knowledge graphs (KGs) serve as useful resources for various natural language processing applications. Previous KG completion approaches require a large number of training instances (i.e., head-tail entity pairs) for every relation, but in practice very few entity pairs are available for most relations. Existing work on one-shot learning does not generalize well to few-shot scenarios and does not fully exploit the supervisory information, and few-shot KG completion has not yet been well studied. In this work, we propose a novel few-shot relation learning model (FSRL) that aims to discover facts of new relations from few-shot references. FSRL can effectively capture knowledge from heterogeneous graph structure, aggregate representations of the few-shot references, and match candidate entity pairs against the reference set of each relation. Extensive experiments on two public datasets demonstrate that FSRL outperforms the state of the art. A sketch of the reference-matching step appears after this list.
  5. Semi-supervised node classification is a challenging problem that has been studied extensively. Graph Neural Networks (GNNs), which update each node's representation by aggregating information from its neighbors, have recently attracted great interest. However, most GNNs are shallow, have a limited receptive field, and may not achieve satisfactory performance when the number of labeled nodes is small. To address this challenge, we propose a graph few-shot learning (GFL) algorithm that incorporates prior knowledge learned from auxiliary graphs to improve classification accuracy on the target graph. Specifically, a transferable metric space, characterized by a node embedding function and a graph-specific prototype embedding function, is shared between the auxiliary graphs and the target graph, facilitating the transfer of structural knowledge. Extensive experiments and ablation studies on four real-world graph datasets demonstrate the effectiveness of the proposed model and the contribution of each component. A sketch of prototype-based few-shot classification appears after this list.
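
Item 2 describes surgical fine-tuning, i.e., updating only a chosen subset of a pre-trained model's layers. Below is a minimal PyTorch sketch of that idea, assuming a torchvision ResNet-18 backbone and a generic (inputs, labels) mini-batch; unfreezing only the first residual block plus the new head is an illustrative stand-in for the paper's per-shift layer selection, not its exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained backbone and attach a new head for the target task.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 10)  # 10 target classes (assumption)

# Surgical fine-tuning: freeze everything, then unfreeze only the chosen
# subset of layers. Early layers are unfrozen here, the kind of choice the
# abstract associates with input-level shifts such as image corruptions.
for p in model.parameters():
    p.requires_grad = False
for p in model.layer1.parameters():
    p.requires_grad = True
for p in model.fc.parameters():  # the freshly initialized head is always trained
    p.requires_grad = True

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9
)
criterion = nn.CrossEntropyLoss()

def train_step(inputs, labels):
    """One fine-tuning step on a target-domain mini-batch."""
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Since the abstract notes that the best subset depends on the type of shift, swapping `model.layer1` for a later block (e.g., `model.layer4`) gives the corresponding later-layer variant.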
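
Item 3's ICP framework combines a neural content aggregator, a pairwise ranking objective trained with negative sampling, and a node-embedding smoothness term. The sketch below is a rough reconstruction from the abstract only; the MLP aggregator, mean pooling, BPR-style ranking loss, and MSE smoothness term are all illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentAggregator(nn.Module):
    """Fuses an entity's heterogeneous content features into a single vector."""
    def __init__(self, content_dim, hidden_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(content_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, content):  # content: (batch, n_pieces, content_dim)
        return self.mlp(content).mean(dim=1)  # mean-pool over content pieces

class ICPSketch(nn.Module):
    def __init__(self, num_entities, content_dim, hidden_dim=64, lam=0.1):
        super().__init__()
        self.aggregator = ContentAggregator(content_dim, hidden_dim)
        self.entity_emb = nn.Embedding(num_entities, hidden_dim)  # contextual node embeddings
        self.lam = lam  # weight of the smoothness term

    def score(self, query_content, entity_ids):
        q = self.aggregator(query_content)  # (batch, hidden)
        e = self.entity_emb(entity_ids)     # (batch, hidden)
        return (q * e).sum(dim=-1)          # dot-product relevance score

    def loss(self, query_content, pos_ids, neg_ids):
        # Pairwise ranking with sampled negatives: the linked entity should
        # outscore a sampled unlinked one.
        pos = self.score(query_content, pos_ids)
        neg = self.score(query_content, neg_ids)
        rank_loss = -F.logsigmoid(pos - neg).mean()
        # Smoothness term: keep the positive entity's embedding close to the
        # aggregated content representation of the query.
        smooth = F.mse_loss(self.entity_emb(pos_ids), self.aggregator(query_content))
        return rank_loss + self.lam * smooth
```

A training loop would sample (query content, positive entity, negative entity) triples and minimize `loss` with any standard optimizer.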
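
Item 4's FSRL matches candidate entity pairs against an aggregated representation of a relation's few reference pairs. The sketch below assumes entity embeddings are already available as vectors and uses mean pooling and cosine similarity as simple stand-ins for the paper's heterogeneous neighbor encoder, reference aggregator, and matching network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FewShotPairMatcher(nn.Module):
    """Scores query (head, tail) pairs against a relation's few reference pairs."""
    def __init__(self, emb_dim, hidden_dim=64):
        super().__init__()
        self.pair_encoder = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def encode_pairs(self, heads, tails):  # each: (n, emb_dim)
        return self.pair_encoder(torch.cat([heads, tails], dim=-1))

    def forward(self, ref_heads, ref_tails, query_heads, query_tails):
        refs = self.encode_pairs(ref_heads, ref_tails)          # (k, hidden), k = few-shot size
        relation_repr = refs.mean(dim=0, keepdim=True)          # aggregate the reference set
        queries = self.encode_pairs(query_heads, query_tails)   # (m, hidden)
        # Higher similarity = the query pair more likely expresses the relation.
        return F.cosine_similarity(queries, relation_repr)
```

Training would then push true query pairs to score above corrupted (negative) pairs for each relation.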
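
Item 5's GFL classifies nodes by comparing their embeddings to class prototypes in a transferable metric space. The function below is a bare-bones prototype classifier over a precomputed node-embedding matrix; the paper's GNN encoder and graph-specific prototype embedding function are replaced here by simple mean prototypes and Euclidean distances, purely for illustration.

```python
import torch
import torch.nn.functional as F

def prototype_classify(node_emb, support_idx, support_labels, query_idx, num_classes):
    """Softly assign query nodes to classes by distance to labeled-support prototypes.

    node_emb:       (num_nodes, dim) node embeddings, e.g. produced by a GNN
    support_idx:    indices of the few labeled nodes
    support_labels: class labels aligned with support_idx
    query_idx:      indices of the nodes to classify
    """
    prototypes = torch.stack([
        node_emb[support_idx[support_labels == c]].mean(dim=0)  # class-c prototype
        for c in range(num_classes)
    ])                                                    # (num_classes, dim)
    dists = torch.cdist(node_emb[query_idx], prototypes)  # Euclidean distances
    return F.softmax(-dists, dim=-1)                      # soft class assignments
```

In GFL itself, the metric space (the node embedding and prototype embedding functions) is learned on auxiliary graphs and then reused on the target graph; this sketch only shows the prototype-distance classification step.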